Support Vector Machines

November 11 + 13, 2024

Jo Hardin

Agenda 11/11/2024

  1. linearly separable
  2. dot products
  3. support vector formulation

tidymodels syntax

  1. partition the data
  2. build a recipe
  3. select a model
  4. create a workflow
  5. fit the model
  6. validate the model
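The later slides use `penguin_train` and `penguin_test` without showing step 1. A minimal sketch of the partition, assuming the data come from the `palmerpenguins` package, that rows with a missing `sex` are dropped, and that the split is 75/25. The seed, the proportion, and the name `penguins_complete` are assumptions, not taken from the slides.

```r
library(tidymodels)      # attaches rsample, dplyr, recipes, parsnip, ...
library(palmerpenguins)  # assumed source of the penguins data

# assumption: keep only penguins with a recorded sex (the outcome)
penguins_complete <- penguins |>
  dplyr::filter(!is.na(sex))

set.seed(47)  # hypothetical seed, for reproducibility only
penguin_split <- initial_split(penguins_complete, prop = 0.75)
penguin_train <- training(penguin_split)  # used to build and tune the model
penguin_test  <- testing(penguin_split)   # held out for the final validation
```

The test set is set aside here and only touched in step 6, so that the reported accuracy is not influenced by the tuning.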

Support Vector Machines

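SVMs create both linear and non-linear decision boundaries. They are efficient because of the kernel trick: the boundary is linear in a high-dimensional feature space, but every computation is a kernel evaluation on the original predictors, so the high-dimensional coordinates are never formed explicitly.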

Deriving SVM formulation

\(\rightarrow\) see class notes for all technical details

  • Mathematics of the optimization to find the widest linear boundary in a space where the two groups are completely separable.

  • Note from derivation: both the optimization and the application are based on dot products.

  • Transform the data to a higher space so that the points are linearly separable. Perform SVM in that space.

  • Recognize that “performing SVM in higher space” is exactly equivalent to using a kernel in the original dimension.

  • Allow for points to cross the boundary using soft margins.
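The equivalence between "SVM in a higher space" and "a kernel in the original dimension" can be checked numerically. A minimal base-R sketch, assuming a degree-2 polynomial kernel \((1 + {\bf x} \cdot {\bf u})^2\) on two predictors; the function names `phi` and `poly_kernel` and the two points are made up for illustration.

```r
# Explicit degree-2 feature map for a 2-dimensional point x = (x1, x2).
# Dot products of these 6-dimensional vectors equal (1 + x . u)^2.
phi <- function(x) {
  c(1,
    sqrt(2) * x[1], sqrt(2) * x[2],
    x[1]^2, x[2]^2,
    sqrt(2) * x[1] * x[2])
}

# The same quantity computed directly in the original 2 dimensions.
poly_kernel <- function(x, u) (1 + sum(x * u))^2

x <- c(1, 2)   # made-up points, for illustration only
u <- c(3, -1)

high_dim <- sum(phi(x) * phi(u))  # dot product in 6 dimensions
original <- poly_kernel(x, u)     # kernel evaluated in 2 dimensions

all.equal(high_dim, original)
```

Because both the optimization and the classification rule only ever need dot products, swapping `sum(phi(x) * phi(u))` for `poly_kernel(x, u)` changes the cost of the computation but not its result.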

Agenda 11/13/2024

  1. not linearly separable (SVM)
  2. kernels (SVM)
  3. support vector formulation

Algorithm: Support Vector Machine

  1. Using cross validation, find values of \(C, \gamma, d, r\), etc. (and the kernel function!)
  2. Using Lagrange multipliers (read: the computer), solve for \(\alpha_i\) and \(b\).
  3. Classify an unknown observation (\({\bf u}\)) as “positive” if: \[\sum_i \alpha_i y_i \phi({\bf x}_i) \cdot \phi({\bf u}) + b = \sum_i \alpha_i y_i K({\bf x}_i, {\bf u}) + b \geq 0\] and as “negative” otherwise.
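Step 3 can be sketched in a few lines of base R. This is an illustration under stated assumptions, not how any particular engine implements it: the kernel is taken to be the Gaussian RBF in the form \(K({\bf x}, {\bf u}) = \exp(-\sigma \|{\bf x} - {\bf u}\|^2)\), and the "fitted" values `X`, `y`, `alpha`, and `b` are hypothetical numbers chosen by hand rather than the result of solving the optimization.

```r
# Gaussian RBF kernel, K(x, u) = exp(-sigma * ||x - u||^2)
rbf_kernel <- function(x, u, sigma = 1) exp(-sigma * sum((x - u)^2))

# Decision value: sum_i alpha_i * y_i * K(x_i, u) + b
svm_decision <- function(u, X, y, alpha, b, kernel = rbf_kernel) {
  k <- apply(X, 1, function(x_i) kernel(x_i, u))  # one kernel value per row of X
  sum(alpha * y * k) + b
}

# Classify as "positive" when the decision value is >= 0
svm_classify <- function(u, ...) {
  if (svm_decision(u, ...) >= 0) "positive" else "negative"
}

# hypothetical "fitted" values: two support vectors, one per class
X     <- rbind(c(1, 1), c(-1, -1))
y     <- c(1, -1)
alpha <- c(0.5, 0.5)
b     <- 0

svm_classify(c(0.8, 1.2), X = X, y = y, alpha = alpha, b = b)
svm_classify(c(-2, -0.5), X = X, y = y, alpha = alpha, b = b)
```

Only observations with \(\alpha_i > 0\) (the support vectors) contribute to the sum, which is why the rows of `X` here are just the support vectors.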

SVM example with defaults

penguin_svm_recipe <-
  recipe(sex ~ bill_length_mm + bill_depth_mm + flipper_length_mm +
           body_mass_g, data = penguin_train) |>
  step_normalize(all_predictors())

summary(penguin_svm_recipe)
# A tibble: 5 × 4
  variable          type      role      source  
  <chr>             <list>    <chr>     <chr>   
1 bill_length_mm    <chr [2]> predictor original
2 bill_depth_mm     <chr [2]> predictor original
3 flipper_length_mm <chr [2]> predictor original
4 body_mass_g       <chr [2]> predictor original
5 sex               <chr [3]> outcome   original

penguin_svm_lin <- svm_linear() |>
  set_engine("LiblineaR") |>
  set_mode("classification")

penguin_svm_lin
Linear Support Vector Machine Model Specification (classification)

Computational engine: LiblineaR

penguin_svm_lin_wflow <- workflow() |>
  add_model(penguin_svm_lin) |>
  add_recipe(penguin_svm_recipe)

penguin_svm_lin_wflow
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: svm_linear()

── Preprocessor ────────────────────────────────────────────────────────────────
1 Recipe Step

• step_normalize()

── Model ───────────────────────────────────────────────────────────────────────
Linear Support Vector Machine Model Specification (classification)

Computational engine: LiblineaR

penguin_svm_lin_fit <-
  penguin_svm_lin_wflow |>
  fit(data = penguin_train)

penguin_svm_lin_fit 
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: svm_linear()

── Preprocessor ────────────────────────────────────────────────────────────────
1 Recipe Step

• step_normalize()

── Model ───────────────────────────────────────────────────────────────────────
$TypeDetail
[1] "L2-regularized L2-loss support vector classification dual (L2R_L2LOSS_SVC_DUAL)"

$Type
[1] 1

$W
     bill_length_mm bill_depth_mm flipper_length_mm body_mass_g       Bias
[1,]       0.248908      1.080195        -0.2256375    1.328448 0.06992734

$Bias
[1] 1

$ClassNames
[1] male   female
Levels: female male

$NbClass
[1] 2

attr(,"class")
[1] "LiblineaR"

SVM example with CV tuning (RBF kernel)

penguin_svm_recipe <-
  recipe(sex ~ bill_length_mm + bill_depth_mm + flipper_length_mm +
           body_mass_g, data = penguin_train) |>
  step_normalize(all_predictors())

summary(penguin_svm_recipe)
# A tibble: 5 × 4
  variable          type      role      source  
  <chr>             <list>    <chr>     <chr>   
1 bill_length_mm    <chr [2]> predictor original
2 bill_depth_mm     <chr [2]> predictor original
3 flipper_length_mm <chr [2]> predictor original
4 body_mass_g       <chr [2]> predictor original
5 sex               <chr [3]> outcome   original
penguin_svm_rbf <- svm_rbf(cost = tune(),
                           rbf_sigma = tune()) |>
  set_engine("kernlab") |>
  set_mode("classification")

penguin_svm_rbf
Radial Basis Function Support Vector Machine Model Specification (classification)

Main Arguments:
  cost = tune()
  rbf_sigma = tune()

Computational engine: kernlab

penguin_svm_rbf_wflow <- workflow() |>
  add_model(penguin_svm_rbf) |>
  add_recipe(penguin_svm_recipe)

penguin_svm_rbf_wflow
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: svm_rbf()

── Preprocessor ────────────────────────────────────────────────────────────────
1 Recipe Step

• step_normalize()

── Model ───────────────────────────────────────────────────────────────────────
Radial Basis Function Support Vector Machine Model Specification (classification)

Main Arguments:
  cost = tune()
  rbf_sigma = tune()

Computational engine: kernlab

set.seed(234)
penguin_folds <- vfold_cv(penguin_train,
                          v = 4)
# the tuned parameters also have default values you can use
penguin_grid <- grid_regular(cost(),
                             rbf_sigma(),
                             levels = 8)

penguin_grid
# A tibble: 64 × 2
        cost     rbf_sigma
       <dbl>         <dbl>
 1  0.000977 0.0000000001 
 2  0.00431  0.0000000001 
 3  0.0190   0.0000000001 
 4  0.0841   0.0000000001 
 5  0.371    0.0000000001 
 6  1.64     0.0000000001 
 7  7.25     0.0000000001 
 8 32        0.0000000001 
 9  0.000977 0.00000000268
10  0.00431  0.00000000268
# ℹ 54 more rows
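
The values in `penguin_grid` are evenly spaced on a log scale, not on the raw scale. A minimal base-R sketch that reproduces the printed values, assuming the ranges implied by the output above (cost from 2^-10 to 2^5 in equal log2 steps, `rbf_sigma` from 10^-10 to 10^0 in equal log10 steps). The names `cost_values`, `sigma_values`, and `manual_grid` are illustrative only.

```r
# 8 levels per parameter, equally spaced in the exponent
cost_values  <- 2^seq(-10, 5, length.out = 8)
sigma_values <- 10^seq(-10, 0, length.out = 8)

# all 8 x 8 = 64 combinations; cost varies fastest, as in the printed grid
manual_grid <- expand.grid(cost = cost_values, rbf_sigma = sigma_values)

signif(cost_values, 3)        # compare with the cost column above
signif(sigma_values[1:2], 3)  # compare with the first two rbf_sigma values
nrow(manual_grid)
```

Spacing the candidates on a log scale lets a small grid cover several orders of magnitude, which matters because both parameters act multiplicatively.
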
# this takes a few minutes
penguin_svm_rbf_tune <- 
  penguin_svm_rbf_wflow |>
  tune_grid(resamples = penguin_folds,
            grid = penguin_grid)

penguin_svm_rbf_tune 
# Tuning results
# 4-fold cross-validation 
# A tibble: 4 × 4
  splits           id    .metrics           .notes          
  <list>           <chr> <list>             <list>          
1 <split [186/63]> Fold1 <tibble [192 × 6]> <tibble [0 × 3]>
2 <split [187/62]> Fold2 <tibble [192 × 6]> <tibble [0 × 3]>
3 <split [187/62]> Fold3 <tibble [192 × 6]> <tibble [0 × 3]>
4 <split [187/62]> Fold4 <tibble [192 × 6]> <tibble [0 × 3]>

SVM model output

penguin_svm_rbf_tune |>
  collect_metrics() |>
  filter(.metric == "accuracy") |>
  ggplot() + 
  geom_line(aes(color = as.factor(cost), y = mean, x = rbf_sigma)) +
  geom_point(aes(color = as.factor(cost), y = mean, x = rbf_sigma)) +
  labs(color = "Cost") +
  scale_x_continuous(trans='log10')

SVM model output - take two

penguin_svm_rbf_tune |>
  collect_metrics() |>
  filter(.metric == "accuracy") |>
  ggplot() + 
  geom_line(aes(color = as.factor(rbf_sigma), y = mean, x = cost)) +
  geom_point(aes(color = as.factor(rbf_sigma), y = mean, x = cost)) +
  labs(color = "rbf_sigma") +
  scale_x_continuous(trans='log10')

SVM Final model

penguin_svm_rbf_best <- finalize_model(
  penguin_svm_rbf,
  select_best(penguin_svm_rbf_tune, metric = "accuracy"))

penguin_svm_rbf_best
Radial Basis Function Support Vector Machine Model Specification (classification)

Main Arguments:
  cost = 0.371498572284237
  rbf_sigma = 1

Computational engine: kernlab

penguin_svm_rbf_final <-
  workflow() |>
  add_model(penguin_svm_rbf_best) |>
  add_recipe(penguin_svm_recipe) |>
  fit(data = penguin_train)

SVM Final model

penguin_svm_rbf_final
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: svm_rbf()

── Preprocessor ────────────────────────────────────────────────────────────────
1 Recipe Step

• step_normalize()

── Model ───────────────────────────────────────────────────────────────────────
Support Vector Machine object of class "ksvm" 

SV type: C-svc  (classification) 
 parameter : cost C = 0.371498572284237 

Gaussian Radial Basis kernel function. 
 Hyperparameter : sigma =  1 

Number of Support Vectors : 137 

Objective Function Value : -31.8005 
Training error : 0.052209 
Probability model included. 

Test predictions

penguin_svm_rbf_final |>
  predict(new_data = penguin_test) |>
  cbind(penguin_test) |>
  select(sex, .pred_class) |>
  table()
        .pred_class
sex      female male
  female     39    5
  male        4   36

penguin_svm_rbf_final |>
  predict(new_data = penguin_test) |>
  cbind(penguin_test) |>
  conf_mat(sex, .pred_class)
          Truth
Prediction female male
    female     39    4
    male        5   36

Other measures

# https://yardstick.tidymodels.org/articles/metric-types.html
class_metrics <- metric_set(accuracy, sensitivity, 
                            specificity, f_meas)

penguin_svm_rbf_final |>
  predict(new_data = penguin_test) |>
  cbind(penguin_test) |>
  class_metrics(truth = sex, estimate = .pred_class)
# A tibble: 4 × 3
  .metric     .estimator .estimate
  <chr>       <chr>          <dbl>
1 accuracy    binary         0.893
2 sensitivity binary         0.886
3 specificity binary         0.9  
4 f_meas      binary         0.897
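
Each of these metrics can be recomputed by hand from the confusion matrix on the previous slide. A minimal base-R sketch, assuming `female` (the first factor level) is treated as the event of interest; the variable names are illustrative.

```r
# counts taken from the confusion matrix above; "female" is the event level
tp <- 39  # truth female, predicted female
fn <- 5   # truth female, predicted male
fp <- 4   # truth male,   predicted female
tn <- 36  # truth male,   predicted male

accuracy    <- (tp + tn) / (tp + fn + fp + tn)  # share of correct predictions
sensitivity <- tp / (tp + fn)                   # share of females found
specificity <- tn / (tn + fp)                   # share of males found
precision   <- tp / (tp + fp)                   # share of "female" calls that are right
f_meas      <- 2 * precision * sensitivity / (precision + sensitivity)

hand_metrics <- round(c(accuracy = accuracy, sensitivity = sensitivity,
                        specificity = specificity, f_meas = f_meas), 3)
hand_metrics
```

The rounded values agree with the `class_metrics()` output: 0.893, 0.886, 0.9, and 0.897.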

Bias-Variance Tradeoff

Test and training error as a function of model complexity. Only the training error decreases monotonically as complexity grows; the test error eventually rises again, so be careful not to overfit. Image credit: ISLR

Reflecting on Model Building

Image credit: https://www.tmwr.org/
